Improving the Discoverability of Scholarly 
Content in the Twenty-First Century 

Collaboration Opportunities for Librarians, Publishers, and Vendors 

Mary M. Somerville, University Librarian/Director and Professor, University of 
Colorado Denver, Auraria Library, Denver, Colorado 

Barbara J. Schader, Associate University Librarian for Collections and Scholarly 
Communication, University of California Riverside, Rivera Library, Riverside, California

John R. Sack, Associate Publisher and Director, HighWire Press, Stanford 
University Libraries and Academic Information Resources, Palo Alto, California 

A White Paper Commissioned by SAGE 

Disclaimer: This white paper was supported by SAGE in an effort to 
contribute further to the conversation and debate around discoverability. 
It does not necessarily reflect the views or policies of SAGE. 


Discoverability is a popular buzzword — ultimately 
meaning the degree to which scholars can locate the
content needed to advance their research and other 
creative activity. Improved user discovery experiences 
require heightened collaboration among (1) scholarly 
publishers and their published authors; (2) search 
engine developers, database providers, abstracting 
and indexing services, and academic publishers; 
(3) electronic resource management and integrated 
library system vendors; and (4) librarians who advance 
institutional discoverability. Drawing from interviews 
with value chain experts, results of research studies, 
and insights from scholarly literature, this white paper 
assesses the currently fragmented discovery environment 
and proposes cross-sector conversations to further 
visibility and, ultimately, usage of the scholarly corpus, 
not only on the open web, but within library services. 

Discoverability: Concept Introduction 

Researchers should have the best of all worlds: discovery 
acceleration tools in familiar web environments, the 
power of detailed indexing to produce highly relevant and 
precise search results, and seamless identification and 
fulfillment experiences. Achieving such ambitions requires 
purposeful conversations among contributors to the value 
chain for scholarship production and dissemination. Four 
main parties are involved in creating and/or consuming 
scholarly content: scholars, who produce the work and 
are its ultimate consumers; editors (often faculty), who 
act as the bridge between scholars and publishers by 
shaping the vision of academic works, managing peer 
review, and ensuring content acquisition; publishers, who 
curate, refine, disseminate, and promote scholarly works; 
and subscribers, largely institutions, who purchase, 
lease, or access the corpus. Traditional scholarly values 
fortify and sustain these long-standing relationships 
despite transformative forces that have irrevocably 
altered the established knowledge generation landscape. 
Discoverability has been particularly transformed, as end 
users employ a growing range of navigation strategies, as
demonstrated by web log analytics that calculate the sites
from which users of scholarly resources were referred 
and studies that report where users started their research 
before arriving at content websites, among other points of 
evidence. To optimize this complex discovery value chain, 
libraries' vendors (bibliographic data services, content 

aggregators, and technology providers), publishers' 
vendors (printers, platform hosts, content architects, and 
technology providers), and search engine providers must 
initiate forward-thinking conversations. 
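
The web log analysis mentioned above, which tallies the sites that referred users to scholarly content, can be illustrated with a minimal sketch. The function name referrer_counts and the sample referrer URLs are hypothetical, and an empty referrer is assumed to mean a direct visit; real analytics tools apply more elaborate rules.

```python
from collections import Counter
from urllib.parse import urlparse


def referrer_counts(referrer_urls):
    """Count visits by the host name of the referring site."""
    counts = Counter()
    for url in referrer_urls:
        host = urlparse(url).netloc.lower()
        # Assumption: an empty referrer means the user arrived directly.
        counts[host or "(direct)"] += 1
    return counts


# Hypothetical referrer values, as they might appear in a web server log.
sample = [
    "https://scholar.google.com/scholar?q=discoverability",
    "https://www.google.com/search?q=open+access",
    "https://library.example.edu/discovery/results",
    "",
    "https://scholar.google.com/scholar?q=metadata",
]

for host, visits in referrer_counts(sample).most_common():
    print(host, visits)
```

Grouping by host name is the simplest possible choice; a publisher or library would likely also group related hosts (for example, all library discovery pages) before comparing pathways.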

[Figure: Content Providers and Service Providers]


Therefore, this white paper, sponsored by SAGE, aims 
to deepen collective dialogue about and reflection on 
the optimum discovery of scholarly publications and 
authoritative information today. Such conversations must 
necessarily consider a wider range of topics: library
discovery tools, web discovery services, publisher 
tutorial services, and library research pages. The 
increasing presence of social media, including "the 
Googlization of everything," predicts that researcher 
behaviors will continue to evolve. As such, suggestions 
for best practices and shared solutions aspire to further 
involve (1) publishers with the authors whose interests 
they represent, (2) search engine developers with the 
publishers who provide them with scholarly content to 
index, (3) electronic resource management (ERM) service 
providers with the publishers and librarians who advance 
institutional discoverability, and (4) librarians with the 
researchers and scholars who contribute to and harvest 
from scholarly materials. These sustained relationships 
could generate actionable outcomes that harness the 
full potential of contemporary technology and human 

This proposal is timely. In recent years— amid 
accelerating, unrelenting changes that promise 
to fundamentally transform scholarly knowledge 
creation, dissemination, and research— the concept of 
discoverability has emerged as a shared concern for 
publishers, vendors, and librarians who are committed 
to enhancing the ease with which researchers can locate 
and use relevant academic material to further studies. 
Although the fourteen supply chain representatives 
interviewed for this paper had markedly different points 
of view, all agreed that improved discoverability depends 
on heightened cross-sector collaboration. Interviewees 

across the industry, from OCLC to EBSCOhost, ITHAKA,
HighWire Press, and Serials Solutions, expressed
this imperative in terms of "shaking hands," "having
conversations," and "thinking together" to enable robust
knowledge exchange and generation activities and 
enduring research and publication practices. 

Discovery Concept Revisited 

In response to value chain representatives' consensus, 
this paper challenges the simplistic definition of 
discoverability as solely comprising technical search 
engine optimization methods for ensuring that content, 
whether licensed, owned, or free, is readily findable in 
the open web. Rather, as study participants agreed, 
even if you "build it" and index it, "they may not come."^ 
Therefore, the location, placement, and context of 
published material are vital to nuanced definitions of 
discoverability. As one value chain contributor observed, 
"resources, information, and data must be visible without 
having to look . . . outside your normal path, in your usual 
space."^ In other words, there are increasingly more ways 
of finding that do not necessarily start with searching, 
such as press releases from researchers' home institutions,
alerting services from journal websites, widgets to 
announce content on related sites, and discussion 
forums and blogs for disciplinary colleagues— all of which 
serve to enhance visibility and promote discovery and, 
ultimately, usage. Review of core published literature, 
including commissioned research studies supplemented 
by proprietary vendor studies, corroborated this 
observation and provided evidence that users are 
discovering scholarly content through an ever-growing 
range of pathways, thereby intensifying the need for 
cross-sector best practices and increased collaboration. 

At present, however, discoverability— including finding 
information serendipitously (i.e., information that you 
didn't even know you needed^)— is an imperfect process 
among already uncertain experiences that depend 
largely on invisible interdependencies among value chain 
contributors and users. In response, this white paper aims 
to explicate evolving interrelationships among traditional 
contributors to scholarship as well as newer participants 
providing integrated library systems, ERM systems, 
e-journal platforms, and web scale discovery services. 

The latter perspectives are not well represented in the 
professional literature, which precipitated interviews 
of industry experts from July through October 2011.

Interview questions probed industry best practices and 
challenges, provoking one interviewee to quip, "I think 
the simple question to ask each of us who are a piece of 
the value chain is 'What practices would you recommend 
for the OTHER guys in the value chain?' . . . since, of 
course, we already implement best practices in our own 
part of the chain, don't we?"" This suggestion guided our 
analysis of interview content, which explores statements 
of best practices and collaboration opportunities across 
the industry, and it informed our mission to encourage 
cross-sector dialogue on improving discoverability and 
visibility of scholarly content, "whenever, wherever, and 
however,"^ with a primary focus on discovery of online 
publications and surrounding services. 

Discoverability: History and Context

Some historical background is helpful in considering 
how we arrived where we are and for the purpose of 
determining where we need to aim because, despite 
increasingly challenging organizational contexts 
exacerbated by economic uncertainty and disruptive 
technologies, "the driving missions of academic 
publishing and librarianship have not changed."^ The 
shared goal remains furthering discovery, access, and 
usage of scholarly publications and creative work. 
Similarly, the age-old process of furthering knowledge 
creation through formal and informal information 
exchange remains constant though uncertain, whereas 
conducting information-seeking and retrieval activities 
has intensified amid the proliferation of new and different 
search tools, sources, and channels,^ which confuse 
traditional signifiers of quality and authority. 

The importance of a sustainable integrated system for 
production and dissemination was anticipated as early 
as 1945 by Dr. Vannevar Bush, director of the Office of
Scientific Research and Development, who, in his classic
Atlantic Monthly article,^ celebrated the record of ideas, 
which catalyze knowledge generation. Bush recognized 
the importance of first selecting credible sources for "the 
record" and then the most relevant sources to advance 
disciplinary understanding. He characterized human 
thinking as associative, concluding that interrogation 
depends on robust indexing schemas that animate an 
intricate "web of trails carried by the cells of the brain." In 
establishing a sense of urgency, Bush noted, "Mendel's
concept of the laws of genetics was lost to the world for 
a generation because his publication did not reach the 

few who were capable of grasping and extending it; and 
this sort of catastrophe is undoubtedly being repeated 
all about us, as truly significant attainments become lost 
in the mass of the inconsequential."^ This concept was 
eloquently rephrased decades later: "We have billions of 
pages indexed in Google, we need a few million good 

In this early call to action, Bush urges collaborative
efforts to address "the massive task of making more 
accessible our bewildering store of knowledge," noting 
that professional "methods of transmitting and reviewing 
the results of research are generations old and by now 
are totally inadequate for their purpose."''^ Nearly a half 
century later, the World Wide Web was invented (in 1990)
and Google launched (in 1998), thereby accelerating the 
knowledge potential and complexity challenges driving 
today's need for better articulated, more collaborative 
discoverability and visibility solutions. 

Discovery Improvement Prerequisites 

In the wake of technology-driven consequences that 
disrupted scholarly publication traditions (including 
search and retrieval), significant progress in the 
past twenty years has advanced the possibility of 
achieving what Bush termed "the record" of human 
accomplishment. Building on the URL (uniform resource
locator), a persistent identifier for digital objects
emerged. Termed the digital object identifier (DOI), it may
include such properties as an ISSN for a journal-level
link. Furthermore, the CrossRef ^ initiative— founded and 
directed by publishers— contains DOIs and metadata, 
including the online locations of objects. This initiative 
enables web scale discovery search engines to link 
authenticated institutional users to local library holdings. 
For our interviewees, it made the DOI "come alive" and 
helped "get me the article"''^ and "find it in the library." 
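
How a DOI becomes a working link can be shown with a brief sketch. It assumes the public resolver at doi.org, which redirects a DOI to the object's current online location; the function names split_doi and doi_to_url and the sample DOI are hypothetical and used only for illustration.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def split_doi(doi):
    """Split a DOI into its registrant prefix and item suffix."""
    prefix, _, suffix = doi.partition("/")
    if not prefix.startswith("10.") or not suffix:
        raise ValueError("not a DOI: " + repr(doi))
    return prefix, suffix


def doi_to_url(doi):
    """Turn a DOI into a persistent link through the resolver."""
    split_doi(doi)  # validate before building the link
    return DOI_RESOLVER + quote(doi, safe="/")


# Hypothetical DOI, used only for illustration.
example = "10.1234/example.2011.001"
print(split_doi(example))
print(doi_to_url(example))
```

The link stays the same even if the publisher later moves the article, because only the resolver's record of the online location changes.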

The OpenURL standard, advanced by the National 
Information Standards Organization (NISO), builds on this 
technology. Established as ANSI standard Z39.88-2004, 
this protocol effectively contains two parts: first, a base 
URL, which refers to the location of OpenURL
resolver software deployed by, for instance, an academic
library; second, a context object, which describes
the item of interest using an agreed syntax, thereby 
permitting identification of additional items of interest. 
In a complementary fashion, a United Kingdom Serials 
Group/NISO initiative known as KBART (Knowledge 

Bases and Related Tools) guides standardizing data and 
practices for ERM knowledge bases that populate library 
website A-Z lists and link resolvers. These initiatives not 
only illustrate the wide-ranging interests and activities 
across the scholarly information community— libraries, 
publishers, ERM vendors, data standards, standards 
organizations, platform vendors, among others— but also 
suggest the complexity of coordinated efforts required 
to attain current levels of reliability and quality across 
multiple information flows, which, if exploited fully, "offer 
a nicely oiled chain— technology working with and for the 
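
The two-part OpenURL structure described above (a base URL for the library's resolver plus a context object describing the item) can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the resolver address resolver.example.edu, the function name build_openurl, and the sample citation are hypothetical, and the key names follow the key/encoded-value journal format of Z39.88-2004 as commonly deployed.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_openurl(base_url, citation, doi=None):
    """Combine a resolver base URL with a context object for one article."""
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Each citation field describes the referent (the item of interest).
    for key, value in citation.items():
        params["rft." + key] = value
    if doi:
        params["rft_id"] = "info:doi/" + doi
    return base_url + "?" + urlencode(params)


# Hypothetical citation and resolver address, for illustration only.
citation = {
    "atitle": "An Example Article",
    "jtitle": "Journal of Examples",
    "issn": "1234-5678",
    "volume": "12",
    "spage": "45",
    "date": "2011",
}
url = build_openurl(
    "https://resolver.example.edu/openurl",
    citation,
    doi="10.1234/example.2011.001",
)

# A resolver would parse the context object back out of the query string.
context_object = parse_qs(urlsplit(url).query)
print(url)
print(context_object["rft.jtitle"])
```

Because the base URL is supplied separately, the same citation can be pointed at any institution's resolver, which then decides where its own users can obtain the item.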

Additional international initiatives are concurrently 
advancing the development of other facets of scholarly 
communications. For instance, the author DOI — like the 
content DOI, which permanently tracks an object (be it a 
book, an article, a chapter, a graph, etc.)— would trace a 
scholar across all of his or her work, whether as a primary 
author of a text, a peer reviewer, or an authoritative 
commenter. In another initiative, an overlay kitemark
would track versions of record in a world where digital 
preprint, postprint, revised, copied, and republished 
versions abound. Named for the British Standards 
Institution certification schemes indicating quality and 
adherence to standards, the kitemark could contain 
metadata ranging from the type of peer review an article 
underwent to the retraction or revision of any citations. 
A complementary initiative advanced by representatives 
from all areas of the community is ORCID (Open 
Researcher and Contributor ID),''^ which aims to provide 
researchers and other entities with unique identifiers to 
associate with their research outputs. Version of record 
is also being addressed'^ to ensure that researchers 
have visibility into the various incarnations of a journal 
article through its life cycle of publication and can locate 
the authoritative and most recent version of a given 
work. NISO has recommended standard version terms, 
and CrossRef has released a new feature for version 
validation, called CrossMark. 

Meanwhile, webmasters are increasingly adopting 
shared schemas within HTML to construct (i.e., mark up) web
pages in ways recognized by major search engines, such 
as Bing and Google. When these search providers directly 
access databases structured by standardized schema, 
they can improve discovery of relevant web pages. 
Within the scholarship realm, ScholarlyArticle offers a 
structured data schema to enable improved discovery 

of appropriate creative content through consideration
of a variety of unique properties, including publisher,
editor, reviewer, genre, reviews, ratings, institution,
location, creation date, and modification date, as well as
author, title, and source^^, all value-added signifiers of
provenance and authority. Since journal publishers began 
providing online access to full-text scholarly articles 
in the late 1990s— thus triggering a revolution in the 
scholarly communications process— these cross-sector 
advancements have assumed growing importance. 
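
As an illustration of the structured markup discussed above, the sketch below assembles a ScholarlyArticle description as JSON-LD, one of the formats that major search engines read from web pages. It assumes the schema.org vocabulary, in which ScholarlyArticle is a defined type; the function name scholarly_article_jsonld and all sample values are hypothetical, and only a handful of the available properties are shown.

```python
import json


def scholarly_article_jsonld(title, authors, publisher, journal, created, modified):
    """Describe one article using schema.org's ScholarlyArticle type."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "headline": title,
        "author": [{"@type": "Person", "name": name} for name in authors],
        "publisher": {"@type": "Organization", "name": publisher},
        "isPartOf": {"@type": "Periodical", "name": journal},
        "dateCreated": created,
        "dateModified": modified,
    }


# Hypothetical article details, for illustration only.
record = scholarly_article_jsonld(
    title="An Example Article",
    authors=["A. Author", "B. Author"],
    publisher="Example University Press",
    journal="Journal of Examples",
    created="2011-07-01",
    modified="2011-10-15",
)

# Embedding the result in a script element makes it readable by crawlers.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(record, indent=2)
    + "\n</script>"
)
print(snippet)
```

The same properties could instead be expressed as attributes on the visible HTML; JSON-LD is used here only because it keeps the example compact.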

Library Discovery Evolution 

For centuries, card catalogs facilitated access to the 
monographic literature. As information and computer 
sciences evolved in the 1970s and early 1980s, 
automated library systems were introduced to replace 
them. Earliest OPACs (online public access catalogs) 
enhanced the search functionalities of traditional card 
catalogs by offering Boolean search functionalities. In 
the late 1990s and early twenty-first century, library 
vendors developed federated search solutions; these 
simultaneously searched, retrieved, and displayed 
content from various remote information hosts— such 
as abstracting and indexing (A&I) services and full-text
databases— but with limited success. In addition, they 
were typically difficult or time-consuming to configure and 
maintain. Later in the decade, library catalogs evolved 
into their next generation, offering increased intuitive 
functionality, integration with open web services, and 
user interfaces mimicking popular websites, such as This generation of catalogs also provided 
the capacity to harvest records from locally hosted 
library silos of information. In short, these systems 
offered new discovery layer options, uncoupled from any 
specific underlying integrated library system, nowadays 
comprising a variety of highly coordinated library 
management system modules. 

More recently, Google Scholar's release in 2004 led to the 
competitive development of web scale discovery services 
for the library environment. In 2009, Serials Solutions 
announced the development of such a resource when 
it unveiled its web scale discovery tool, Summon. Other
vendors soon followed with similar products, such as 
EBSCO's Discovery Service and Ex Libris's Primo Central.
These products more easily connect researchers with 
the library's vast information repository, including locally 
held and hosted content, as in physical holdings, digital 

collections, and local institutional repositories. Perhaps 
more significant, web scale discovery enables access 
to a widespread array of remotely hosted content, often 
purchased or licensed by the library, such as publisher 
and aggregator content for tens of thousands of full-text 
journals, additional content from A&I resources, and
content from open-access repositories. This is made 
possible by preharvesting and centrally indexing content 
sourced across multiple silos, thereby streamlining 
discovery and delivery of content. In other words, "web 
scale discovery can be considered as deep discovery 
within a vast ocean of content . . . normalized into an 
underlying schema developed by the discovery service 
vendor that facilitates indexing, relevancy ranking, and 
even level of presentation for different content types with 
potentially varying levels of metadata,"^^ searching a 
broader collection than what the local library may own or 

Scholarly Ecosystem Shifts 

Web scale discovery and visibility tools depend on 
value-added, largely invisible contributions of authors, 
publishers, librarians, and vendors who compose the 
scholarly value chain. In this symbiotic ecosystem, 

• librarians manage systems for institutional 
collection, dissemination, and retrieval of the 
scholarly corpus; 

• publishers produce and promote authors' work 
through formats findable on the open web and in 
library catalogs; 

• publishers' technology vendors supply 
e-publication platforms and strategic 
discoverability solutions; and 

• libraries' technology vendors connect publishers' 
digital content to OPACs through ERM systems 
and web scale discovery services. 

Traditionally, these content and service providers satisfied 
complementary roles: publishers provided gatekeeper 
services, ensuring peer-reviewed content adjudicated by 
peer-reviewed editorial boards; in turn, librarians served 
as access gatekeepers for the published authoritative 
resources. However, the Internet has disturbed those 
comfortable and conventional relationships, thereby 
necessitating reinvention of centuries-old partnerships 

mindful of the mandate to make scholarly content 
widely "discovered or discoverable." This now involves 
search engine optimization (SEO) and search engine 
interoperability to promote effective crawling, indexing, 
and ranking by search engines— "thinking, in other words, 
about the robot users of our systems as well as the 
human users."''^ 

The purpose for optimizing online products for search 
engines is essentially to improve their visibility to readers 
and researchers of all kinds. This challenges publishers to 
invest in technically sound SEO strategies as a standard 
element of editorial and operational divisions, which can 
disturb standard business practices. Publishing house 
staff must grow and maintain actionable knowledge of 
SEO techniques, which regularly fluctuates as online 
technologies and the businesses that offer them advance. 
Publishers must also continually monitor the successful 
discovery of their products through sites like Google and 
Bing, and make rapid modifications to content platforms 
and online products to keep pace with the changeable 
landscape of online searching. 

Publishers are equally concerned with effectively 
mapping their products for use within the diverse arena 
of library products and services. Unique library website 
designs and OPACs come in wide varieties. In addition, 
to ensure quality discoverability of their products within 
the library ecosystem, publishers must now produce 
quality secondary data for ERM vendors. Traditionally, 
generation of this metadata was the purview of A&I
services. Today, however, publishers must fulfill the 
expectation to deliver free bibliographic data at purchase, 
without any assurance that libraries will use these data 
in uniform ways— if at all. Publishers must meet the 
resource demands for library indexing and cataloging 
requirements in staff knowledge and time as well as 
systems and equipment. To scale these functions, 
publishers must overcome manual maintenance routines 
and establish automated content management systems 
that allow metadata deliveries to vendors that are both 
cost effective and time efficient. Investment in XML-based 
technologies has also become a standard infrastructural 
addition to most publishing houses.^" 

In contrast, discovery of and access to content remains 
important for libraries, in librarians' opinion^^— despite 
growing faculty perceptions that libraries' value 
resides in their "buyer" function, which increasingly 
"disintermediates" libraries from scholarly research 

processes.2^ Traditionally, this role was expressed through 
a combination of effective cataloging and classification, 
open and browsable stacks, A&I tools, reference/research
support, instructional programs, and other services that 
improve the range and quality of information available 
in and through libraries. In a discovery environment 
increasingly dominated by web search services, such 
as Google and Bing, libraries are struggling to perform 
their discovery role amid increasingly complex changing 
workflows, licensure restrictions, statistics analysis, 
and return-on-investment expectations. Despite 
these obstacles— further exacerbated by uncertain 
and declining budgets^^— libraries are in increasing 
numbers implementing web scale discovery platforms 
that manage local access through a single index that 
provides relevancy ranking, facets for drilling deeply into 
search results, user ability to write or read summaries 
and read or add editorial comments, and agnostic 
access to content in all forms. Furthermore, all this can 
occur in mobile mode because companies such as Ex 
Libris, EBSCO, OCLC, and Serials Solutions partner with
growing numbers of publishers of primary and secondary
content (scholarly corpus and A&I services, respectively)
to produce simplified, centrally indexed content, amid 
growing recognition in all scholarly value chain sectors of 
the importance of web scale discovery services. 

As a consequence, libraries can now replicate the 
centralized search model of Google's search interface 
and speed, content breadth, and quality results, thereby 
finally addressing the vexing question, if Google can do it, 
why can't libraries? Although the implications for libraries 
are not fully understood in terms of implementing web 
scale discovery services, at least one published study 
reports a dramatic decrease in the use of traditional A&I
databases and an equally dramatic increase in the use
of resources from full-text database and online journal
collections.^" In anticipating this phenomenon, an A&I
vendor responded in an industry survey, "These services 
may expose our content to users who would never think 
to choose our database for their search, and my fear is 
that if we are not 'in,' then we are well and truly 'out.' On 
the other hand, we may lose brand recognition and if their 
usage reporting isn't sophisticated enough, how will the 
library know that it was our database that navigated the 
user to the full text? So we risk losing out that way too."^^ 
Similarly, within a library context, when a link resolver 
enables Google Scholar, it eliminates the need for a user 

to understand the distinctions among databases^^— 
reflective of the dilemma that "while authors and 
readers want us to be invisible, libraries, publishers, and 
vendors want constituencies to recognize our value. "^'^ 
Contributors throughout the value chain experience such 
uncertainties in the wake of a former library monopoly on 
access to peer-reviewed scholarship. 

Shared Aspirations and Accomplishments 

As a consequence, publishers, libraries, and vendors 
must necessarily explore the following: "In these days 
where users are searching across huge amounts of 
information with free web tools, how can we support 
discovery of the quality vetted and peer reviewed content 
that libraries invest in and scholars require at appropriate 
points in their workflow?"^^ In echoing that publisher's 
sentiments, two discovery service leaders phrased the 
quandary thusly: "How can you make searching the 
library as easy as searching the Web?"^^ and "The users 
are comfortable with the open web and the Googles 
of the world. We need to make our services just as 
natural and easy to use."^° This shared cross-sector 
aspiration requires expanded partnerships to promote 
discoverability and visibility— that is, "Can I find it?" and 
"Can it find me?"^^ 

Discoverability requires content to be well indexed 
and well represented. Ideally, metadata would be 
continually enriched through the supply chain as they 
pass from author to publisher to platform to ERM vendor 
to discoverability service to library and, finally, to the 
end user. In response, publishers have evolved best 
practices for metadata, "depositing it anywhere they 
will accept it,"^^ such as RSS feeds for library vendors. 
Routine iterative testing now generates new publisher 
website design practices that ensure optimum search 
engine optimization, measured by assessment tools 
with increasingly sophisticated success metrics. Many 
platform providers that partner with publishers further 
discovery through content enrichment and regular 
usability testing that ensures that online content is 
well presented— whether on a publisher's website or a 
university catalog, whether at home or work, whether 
through Google Scholar or PubMed. 

Visibility involves placing information in locations where 
people will come across it in the work that they do. In 
response, publishers and others have initiated various 
Web 2.0 efforts to further engage online content— for 

instance, Facebook pages and blogs dedicated to 
individual publications (e.g., journals) or to cohorts of 
scholars and authors within particular fields of study. In 
addition, publishers are beginning to explore enhanced 
information environments for novice researchers (displaying
encyclopedia entries alongside journal articles
and developing search widgets to populate library
sites^^) as a supplement to other end user support
services. Finally, in response to growing demand from 
mobile device owners, contributors across the value 
chain are developing mobile websites, apps, and related 

As a consequence of the increased pressures for
institutional libraries to demonstrate outcomes and 
impact and maximize resource usage, best practices have 
evolved in recent years through adoption of COUNTER 
(Counting Online Usage of Networked Electronic 
Resources) and SUSHI (Standardized Usage Harvesting 
Initiative)^" for content access and web analytics for 
user behavior. Value chain interviewees concurred that 
additional discussion on enhanced metrics exploring, 
among other dimensions, the matter of completeness and 
currency would enhance the practical use of such data.^^ 
As expressed by one journal aggregator vendor, "how 
do you measure what isn't found?"^^ Such sentiments 
point to the heightened level of aspiration needed to take 
discoverability and visibility to the next phase. 

Collaborative Conversations Leading to Better 
Practices and Next Steps 

Despite considerable progress and impressive goodwill, 
much work remains. Libraries and commercial entities 
need to find new ways of working together. Again, this 
proposal is timely, given that web statistical services such 
as Google Analytics demonstrate that researchers are 
increasingly using many pathways to discover content. 
To improve user experiences, value chain contributors 
spanning the full range must share "what they want 
and need from one another,"^'' including specific 
functions, best practices, unmet goals, and collaborative 
recommendations. Drawing from expert cross-sector 
interview data, the following recommendations highlight 
optimism for future collaborations, with the promise 
to enhance discoverability through changed industry 
standards that will catalyze and crystallize new best 

For publishers and vendors: 

• Initiate cross-platform, cross-publisher 
investigations to identify best industry practices, 
further share standards, and apply researcher 
behavior findings, then revise online product 
and publisher website designs based on these 
cooperative efforts. 

• Become more conversant with how libraries 
operate so that they can more successfully 
advance local discoverability through 
improved records workflow, acquisitions 
functions, statistics management, and systems 

• Implement more open, standardized approaches 
to online hosting that allows published content 
to be used as a platform upon which others can 
innovate, such as 

o CrossMark standard to signal to the
user which version of a scholarly
item (that is, of the many versions)
is in fact the archival, published one;

o Machine-readable Creative
Commons license tagging to guide
usage privileges and attribution

For publishers and librarians: 

• Vigilantly monitor knowledge of researcher 
needs and habits (which will inevitably change 
as discovery and delivery functions evolve) to 
improve the connections between readers and 

• Collaborate on metadata enrichment and 
successful ingestion into library systems, such 
as OPACs, and coordinate about routine testing 
to ensure that all holdings are visible and easily 

• Productively collaborate on improved means 
of teaching novice and expert researchers to 
use existing systems,"^ with the aim of building 

systems that are better suited to the way that 
researchers want to behave. 

For all members of the scholarly 
communication industry: 

• Consider what new discoverability services, 
given general-purpose search engines access to
metadata records for indexing purposes, could 
be leveraged from search engine utilities. For 
example, widespread adoption of ScholarlyArticle 
tagging, found at, is an especially 
promising initiative, as is standardizing the 
metadata embedded in HTML and PDF versions 
of an article.

• Revisit how business is done based on the 

o First, the difference between library 
patron and consumer is blurring. 
Most users do not recognize 
exactly where content is served or 
stored, and they may be willing to 
directly pay all or part of the cost to 
secure the needed information. For 
these reasons, more varied pricing 
structures need to evolve. 

o Second, content is fluid in an online 
ecosystem, where users may want 
only a sentence or a page out of a 
whole publication. The conversation 
in the value chain therefore needs 
to consider expanded copyright 
solutions. If such barriers were 
removed, libraries may save money; 
publishers may uncover new revenue 
streams; and end user access may 

• Further cross-industry standards for content file 
formats, quality of metadata, and usage statistics 
to ensure interoperability among search engines, 
publisher platforms, and integrated library 
systems, especially as new models for scholarly 
communication develop.
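
As a rough illustration of the ScholarlyArticle tagging and machine-readable license tagging mentioned in the recommendations above, the following Python sketch builds a schema.org ScholarlyArticle record as JSON-LD and wraps it in a script element that a publisher platform could embed in an article's HTML page. It is a minimal sketch under stated assumptions, not a description of any publisher's actual practice: the function names are hypothetical, the choice of JSON-LD (rather than microdata or RDFa) is an assumption, and the license URL is a placeholder, not the license of this white paper.

```python
import json


def scholarly_article_jsonld(title, authors, license_url=None):
    """Build a schema.org ScholarlyArticle record as a plain dict.

    Hypothetical helper for illustration only.
    """
    record = {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "headline": title,
        "author": [{"@type": "Person", "name": name} for name in authors],
    }
    if license_url:
        # Machine-readable pointer to the license governing reuse.
        record["license"] = license_url
    return record


def as_script_tag(record):
    """Serialize the record as a JSON-LD script element for an HTML page."""
    payload = json.dumps(record, indent=2, ensure_ascii=False)
    # Escape "</" so the payload cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    article = scholarly_article_jsonld(
        title="Improving the Discoverability of Scholarly Content "
              "in the Twenty-First Century",
        authors=["Mary M. Somerville", "Barbara J. Schader", "John R. Sack"],
        # Placeholder license URL, assumed for the example.
        license_url="https://example.org/licenses/placeholder",
    )
    print(as_script_tag(article))
```

Because the same record can be generated once and reused for the HTML page and for feeds sent to discovery services, a sketch like this also hints at how consistent embedded metadata might be maintained across versions of an article.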



The development of more sophisticated discovery and 
visibility strategies very much depends on heightened 
cross-sector collaborations. The conversations proposed 
above suggest some especially promising topics for 
discussion, which surfaced during interviews with sector 
experts. Such exchanges on improvements in web scale 
discovery are timely, as the technical prerequisites, 
shared standards, and best practices for significantly 
enhanced search performance have either been 
developed or are in development. At the same time, new 
forms of scholarship are emerging, and user experience 
expectations are accelerating — intensifying the need for 
value chain contributors to initiate boundary-crossing
inquiries that benefit scholarship. Librarians know the 
research and discovery needs of their patrons; publishers 
and editors understand the curation, production, and 
dissemination of scholarly content; and vendors provide 
necessary technological infrastructure through platform, 
discovery, and organizational tools; however, each does
not sufficiently understand the perspective of the others. 
Collaboration across the academic value chain is critical 
if we are to realize our collective potential and catalyze 
knowledge generation for today's scholars. 


Gratitude and praise are offered to those who graciously
gave their time to participate in this research project and 
share their thoughts on the changing face of information 
discovery. Their insights, suggestions, and referrals to 
other individuals, readings, and blogs have informed 
this white paper and furthered the very cross-sector 
conversations we strongly recommend. 

• Kimberly Armstrong, Deputy Director, Center for 
Library Initiatives 

• Mike Buschman, Director, Product Management, 
Serials Solutions 

• Lettie Conrad, Online Product Manager, SAGE 

• Michael Gorrell, Senior Vice President, Chief 
Information Officer, EBSCO Publishing

• David Horowitz, Vice President of Sales, SAGE 

• Simon Inger, Simon Inger Consulting, Ltd. 

• Suzanne Kemperman, Director, Publisher 
Relations, Business Development Group, OCLC 

• George Machovec, Interim Executive Director, 
Colorado Alliance of Research Libraries 

• Ed McBride, Director of Library Sales, SAGE 

• Elena Nikitina, Executive Director of Journals
Marketing, SAGE 

• John Sack, Associate Publisher and Director, 
HighWire Press, Stanford University 

• Martha Sedgwick, Senior Manager, Online 
Products Team, SAGE 

• Ron Snyder, Technology and Research Manager, 

• Jabin White, Vice President of Content 
Management, ITHAKA 

Further Reading 

Beall, Jeffrey. "How Google Uses Metadata to Improve 
Search Results." The Serials Librarian 59, no. 1 (2010): 

Bilder, Geoffrey. Social Media and Scholarly 
Communication. Oxford, UK: ISMTE, 2010. 

Calhoun, Karen, Joanne Cantrell, Peggy Gallagher, and 
Janet Hawk. Online Catalogs: What Users and Librarians 
Want-An OCLC Report. Columbus, OH: OCLC, 2009. 

CIBER. Social Media and Research Workflow. London: 
UCL, 2010. 

Collins, Maria, and Jill E. Grogg. "Building a Better 
ERMS." Library Journal 136, no. 4 (2011): 22-28. 

Connaway, Lynn S., and Timothy J. Dickey. The Digital 
Information Seeker: Report of the Findings from Selected 
OCLC, RIN, and JISC User Behavior Projects. London:
Higher Education Funding Council for England, 2010.
reports/2010/digitalinformationseekerreport.pdf.

Gray, Catherine. "E-journals: Their Use, Value and 
Impact— Final Report." January 18, 2011. 
http://www.rin.ac.uk/our-work/communicating-and-

Head, Alison J., and Michael B. Eisenberg. Truth Be Told:
How College Students Evaluate and Use Information
in the Digital Age. Seattle, WA: Information School,
University of Washington, 2010.
http://projectinfolit.org/pdfs/PIL_Fall2010_Survey_


Inger, Simon, and Tracy Gardner. "How Readers 
Navigate to Scholarly Content— Comparing the 
Changing User Behaviour between 2005 and 2008 
and Its Impact on Publisher Web Site Design and 
Function." September 9, 2008. 

Kenneway, Melinda. "Author Attitudes toward Open 
Access Publishing." TBI Communications on behalf of 
InTech Open Access Publisher, April 27, 2011. http:// 1 .pdf.

Maron, Nancy L., and K. Kirby Smith. Current Models 
of Digital Scholarly Communication — Results of an 
Investigation Conducted by ITHAKA for the Association 
of Research Libraries. Washington, DC: Association of
Research Libraries, 2008.

Register, Renee, Kevin Cohn, Les Hawkins, Helen 
Henderson, Regina Reynolds, Steven C. Shadle, William 
Hoffman, Sri Rajan, and Paoshan W. Yue. "Metadata in 
a Digital Age: New Models of Creation, Discovery, and 
Use." The Serials Librarian 56, nos. 1-4 (2009): 7-24. 

Research Information Network. "Social Media: A Guide 
for Researchers." February 7, 2011.

Schonfeld, Roger C. "Faculty Survey 2009: Key Strategic 
Insights for Libraries, Publishers, and Societies." April 7, 

Smit, Eefke, and Maurits van der Graaf. "Journal 
Article Mining: A Research Study into Practices, 
Policies, Plans . . . and Promises." Commissioned 
by the Publishing Research Consortium. May 2011.

Wittenberg, Kate. "The Role of the Library in 21st Century 
Scholarly Publishing." In No Brief Candle: Reconceiving 
Research Libraries for the 21st Century. Washington, DC: 
Council on Library and Information Resources, 2008. 
http://www.clir.org/pubs/reports/pub142/pub142.pdf.


1. Simon Inger, interview, September 8, 2011.

2. Inger, interview. 

3. Maureen Donovan, "Networking and the Changing Environment for
Academic Research," in Scholarly Practice, Participatory Design and
the Extensible Catalog, ed. Nancy Fried Foster, Katie Clark, Kornelia
Tancheva, and Rebekah Kilzer (Chicago: ALA, 2011), 51-74.

4. John Sack, interview, July 15, 2011.

5. Suzanne Kemperman, interview, October 5, 2011.

6. Lettie Conrad, "Discovering Authoritative Reference Material: It's All 
about 'Location, Location, Location,'" in E-reference Context and 
Discoverability in Libraries: Issues and Concepts, ed. Sue Polanka 
(Hershey, PA: IGI Global, 2011), 137-47. 

7. Sudatta Chowdhury, Forbes Gibb, Monica Landoni, "Uncertainty 
in Information Seeking and Retrieval: A Study in an Academic 
Environment," Information Processing and Management 47 (2011):

8. Vannevar Bush, "As We May Think," Atlantic Monthly, July 1945, 
http://www.theatlantic.com/magazine/archive/1945/07/as-we-may-

9. Bush, "As We May Think." 

10. Sack, interview.

11. Bush, "As We May Think."

12. CrossRef, 

13. Ross MacIntyre, "The Technologies That Oil the Supply Chain,"
Serials 24, no. 1 (2011): 89-92.

14. MacIntyre, "The Technologies That Oil."

15. ORCID, 

16. Lettie Conrad, "Journal Article Versioning Is Harder Than It Looks . . 
. or Should Be!" Against the Grain 23, no. 2 (2011): 20-21.


18. Jason Vaughan, "Web Scale Discovery: What and Why?" Information 
Technology & Libraries (2011), http://digitalcommons.library.unlv.
edu/lib_articles/44/ or
prepub/vaughan2011.pdf.

19. Lorcan Dempsey, "Effective Web Presence . . . Lorcan Dempsey's
Weblog," May 31, 2011,

20. Conrad, "Discovering Authoritative Reference Material." 

21. Matthew P. Long and Roger C. Schonfeld, "Ithaka S+R Library
Survey 2010: Insights from U.S. Academic Library Directors," 

22. Roger C. Schonfeld, "Faculty Survey 2009: Key Strategic Insights 
for Libraries, Publishers, and Societies," April 7, 2010, http:// 

23. Publishers Communication Group, "Library Budget Predictions for 
2011: Results from a Telephone Survey," August 2010, http://www. 1 .pdf.

24. Doug Way, "The Impact of Web-Scale Discovery on the Use of a 
Library Collection," Serials Review 36 (2010): 214-20. 

25. National Federation of Advanced Information Services, "NFAIS 
Survey on Discovery Services," April 2010, 

26. Carol P. Diedrichs, "Discovery and Delivery: Making It Work for 
Users," The Serials Librarian 56, nos. 1-4 (2009): 79-93. 

27. Sack, interview. 

28. Martha Sedgwick, interview, August 15, 2011. 

29. Mike Buschman, interview, July 13, 2011.

30. Michael Gorrell, interview, August 8, 2011. 

31. As paraphrased from Sack, interview.

32. Sedgwick, interview.

33. Sedgwick, interview.

34. COUNTER, "About COUNTER," 

35. Sack, interview. 

36. Jabin White, interview, September 28, 2011.

37. Sack, interview. 

38. Kim Armstrong, interview, September 6, 2011. 

39. Sack, interview. 

40. Lettie Conrad, interview, June 30, 2011.

41. Sedgwick, interview.

42. Sack, interview. 

43. As paraphrased from Suzanne Kemperman interview. 

44. Multiple interviews with experts in publishing, librarianship, content
architecture and archiving, and related services, completed in the
course of this research, 2011.